Acceptance sampling plan based on difference in difference estimator with application

This study designs an acceptance sampling plan based on the Difference-in-Difference estimator. The plan is intended for inspecting product units whose lifetime follows the normal distribution. The operating characteristic function is discussed for two cases: standard deviation known and standard deviation unknown. The parameters of the proposed plan are determined by minimizing the sample size subject to the stated optimization conditions. The results are computed and tabulated for various combinations of acceptable quality levels and limiting quality levels. All computations are performed in the R statistical programming software. A real-life application of the proposed sampling plan is discussed in detail.


Muhammad Azam, Maira Ahsan Khan, Asma Arshad, Muhammad Saleem & Muhammad Aslam
Inspecting finished goods through acceptance sampling plans is an essential component of industrial production, because these plans provide the procedure for lot sentencing. They matter because they determine the minimum sample size needed in a given scenario to decide whether a lot should be accepted or rejected. To maintain their reputation for quality, manufacturers pass their finished production lots through suitable acceptance sampling plans [1-4]. The field of statistical process control (SPC) offers tools, including acceptance sampling plans, that suit a range of industrial settings [5]. Producers favor plans with the minimum sample size because they want to avoid needless inspection cost and effort [6]. As a result, plans that align with production conditions are implemented. Producers are constantly interested in efficient sampling plans that need less inspection time and money, so as to reduce losses while maintaining high standards of quality [7].
Efficient sampling plans on production lines are crucial for quick lot sentencing, and acceptance sampling plans are among the most important instruments for this purpose [6]. Plans come in two varieties: variable acceptance sampling plans and attribute plans. The SPC literature is comprehensive and offers acceptance sampling plans tailored to the requirements of different production settings. Of the two varieties, attribute plans are thought to be easier to implement in industry. However, today's producers are more demanding: they use more data and need greater sensitivity to achieve the desired outcomes, and for this reason variable sampling plans are preferred over attribute sampling plans [8]. Since extraneous factors naturally disturb manufacturing, two risks in lot sentencing cannot be avoided: rejecting a good lot and accepting a bad one [7]. In many real-world situations, extra, antecedent, or auxiliary knowledge can significantly improve the lot sentencing decision. The quality control literature favors the use of such additional information in support of the study variable y.
Ref. [8] used a single auxiliary variable in designing an acceptance sampling plan and achieved improved results. The literature on statistical process monitoring control charts that use a single auxiliary variable is also rich; a detailed discussion can be found in [9-15]. Ref. [16] presented an auxiliary-information-based arithmetic mean control chart. Ref. [17] proposed an auxiliary-information EWMA control chart. Ref. [18] developed multivariate regression-based mean and variance control charts. Ref. [19] constructed an individual monitoring control chart based on cluster-regression methods. Recently, [20] introduced auxiliary information to make lot acceptance under an acceptance sampling plan more efficient; [21] designed an economical acceptance sampling plan in the presence of inspection errors; [22] used the regression estimator in developing an EWMA-statistic-based acceptance sampling plan; [23] employed the regression estimator in the presence of uncertainty to create a successful plan; [24] suggested a repeated acceptance sampling strategy based on multiple dependent state sampling, both with and without an auxiliary dataset, to demonstrate the effectiveness of the suggested plan; and [25] recommended product acceptance based on the linear profiles of two suppliers, using Wang's test statistic to determine the minimum sample size required for an effective plan. Moreover, [26-35] introduced further procedures of this kind, which work well under certain production environments.
The contribution of this research is a new acceptance sampling plan for situations in which both the producer and the consumer require more than one auxiliary variable, so that the study variable (the variable of interest) is estimated more precisely and both risks are satisfied. The new index with three auxiliary variables used in this paper's methodology is novel, and the difference-in-difference estimator approach gives better results, in the sense that a smaller sample size is needed for lot sentencing than under the other acceptance sampling plans in the literature. To the best of the authors' knowledge, no attempt has been made in the literature to create such an acceptance sampling plan. This motivates the present study.

The difference estimator plays a significant role in the statistical literature, particularly when the goal is to estimate the difference between two population parameters. Some areas where the concept contributes, with examples, are as follows:
• Treatment Effects in Experimental Studies: Difference estimators are commonly used to assess treatment effects in experimental studies. They help quantify the impact of a treatment or intervention by comparing outcomes. Example: In a clinical trial, a new drug's effectiveness is evaluated by measuring the difference in health outcomes between patients who received the drug and those who received a placebo.
• Before-and-After Studies: Difference estimators are valuable in assessing changes over time, as they quantify the difference in outcomes before and after an intervention or policy change. Example: A city implements a traffic management system, and the difference in traffic congestion levels before and after implementation is used to estimate the system's impact.
• Paired Observations and Matched Samples: Difference estimators are commonly employed with paired observations or matched samples. They help account for individual variability by focusing on the differences within pairs or matched groups. Example: In a study comparing two teaching methods, the difference in test scores is calculated for each student, and the average difference is used as the estimator.
• Economic Studies and Control Groups: Difference estimators are essential in economic studies, especially with observational data. They help control for unobserved factors by comparing differences in outcomes over time or between groups. Example: The impact of a policy change on employment rates is evaluated by comparing employment levels before and after the change.
• Quasi-Experimental Designs: Where true randomization is difficult, difference estimators are valuable in quasi-experimental designs, helping to control for confounding variables. Example: The impact of a new teaching method is assessed in a school where random assignment is not feasible, by comparing the performance of classes that adopted the method with that of classes that did not.

In summary, the difference estimator helps estimate and interpret the impact of interventions, policy changes, or experimental conditions by quantifying differences in outcomes, which makes it a versatile tool in various fields of study.
To develop the acceptance sampling plan, this study uses two auxiliary variables in addition to the variable of interest. To the authors' knowledge, no acceptance sampling plan has been created with two auxiliary variables. The suggested methodology therefore uses the auxiliary data in the form of the Difference-in-Difference (DID) estimator, first put forth by [36], to examine the variable of interest. The competency of the proposed concept is assessed through the minimum sample size required for lot inspection. The remainder of the paper is organized as follows: the conceptual foundation of the DID estimator and the approach of the suggested plan are covered in Sections "Methods" and "Results discussion". The real-world example and the findings from the simulation runs are described in Section "Results findings", and the concluding observations are contained in Section "Conclusions". Figure 1 displays the methodology's flowchart, and Table 1 lists the key symbols and notations.

Methods
This section describes the procedures adopted for designing the proposed sampling plan. The underlying problem and the motivation are as follows. Producers often find it complicated to replace simple random sampling with a more complex sampling scheme. They also note that various kinds of auxiliary information are readily available within the production system and can be used to improve the precision of the estimate of the study variable without resorting to complex sampling schemes. Under this philosophy, using several auxiliary variables in the estimation does not add to the sampling cost, and it gives much-improved results in correctly identifying defective items.
The proposed acceptance sampling plan is based on the DID estimator introduced by [36]. The variable $y$ is the study variable, while $x$ and $z$ are the auxiliary variables. There are $l$ random samples, each of size $n$, drawn for each variable: $y_{ij}, x_{ij}, z_{ij}$; $i = 1, 2, 3, \ldots, l$; $j = 1, 2, 3, \ldots, n$. The triple $(y, x, z)$ is taken as a tri-variate normal random variable with mean vector

$$\underline{\mu} = \begin{pmatrix} \mu_y \\ \mu_x \\ \mu_z \end{pmatrix}$$

and variance-covariance matrix

$$\underline{\Sigma} = \begin{pmatrix} \sigma_y^2 & \rho_{yx}\sigma_y\sigma_x & \rho_{yz}\sigma_y\sigma_z \\ \rho_{yx}\sigma_y\sigma_x & \sigma_x^2 & \rho_{xz}\sigma_x\sigma_z \\ \rho_{yz}\sigma_y\sigma_z & \rho_{xz}\sigma_x\sigma_z & \sigma_z^2 \end{pmatrix}$$

Here, $\mu_y$, $\mu_x$ and $\mu_z$ are the population means of $y$, $x$, and $z$; $\sigma_y^2$, $\sigma_x^2$ and $\sigma_z^2$ are the population variances; and $\rho_{yx}$, $\rho_{yz}$ and $\rho_{xz}$ are the correlation coefficients between the respective pairs of variables. The expressions for the mean and variance of the DID estimator follow [36]. The suggested acceptance sampling plan covers two cases, σ known and σ unknown, which are examined separately as follows.

Case 1: σ known.
The suggested design applies the estimator of [36] to a single acceptance sampling plan in statistical process control. Its operating characteristic (OC) function is intended to be met with a smaller sample size $n$ than the sampling schemes currently in use. The plan is built on an acceptance statistic $E$, where $k$ is the acceptance number, to be determined through simulation runs, and USL is the upper specification limit set by the manufacturer.
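The tri-variate normal model for $(y, x, z)$ described in this section can be illustrated with a short simulation. The sketch below is only illustrative: the function names, the means, the unit standard deviations, and the common correlation of 0.7 are assumptions chosen for the example, not values taken from the paper, and the paper's own computations were done in R rather than Python.

```python
import math
import random


def covariance_matrix(sd, rho_yx, rho_yz, rho_xz):
    """Build the 3x3 variance-covariance matrix for (y, x, z).

    sd is a tuple (sigma_y, sigma_x, sigma_z).
    """
    sy, sx, sz = sd
    return [
        [sy * sy, rho_yx * sy * sx, rho_yz * sy * sz],
        [rho_yx * sy * sx, sx * sx, rho_xz * sx * sz],
        [rho_yz * sy * sz, rho_xz * sx * sz, sz * sz],
    ]


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a must be positive definite)."""
    size = len(a)
    low = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def draw_sample(n, mean, cov, rng):
    """Draw n tri-variate normal observations (y, x, z)."""
    low = cholesky(cov)
    sample = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]  # independent N(0, 1) draws
        sample.append(tuple(
            mean[i] + sum(low[i][k] * e[k] for k in range(i + 1))
            for i in range(3)
        ))
    return sample


# Hypothetical parameter values, for illustration only.
rng = random.Random(1)
cov = covariance_matrix((1.0, 1.0, 1.0), 0.7, 0.7, 0.7)
sample = draw_sample(50, (10.0, 5.0, 2.0), cov, rng)
```

Each element of `sample` is one $(y, x, z)$ observation; the Cholesky factor is used so that the simulated columns carry the requested variances and correlations.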

Design
The steps for implementing the suggested sampling design are as follows:
Step 1 Specify the producer's and consumer's risks, denoted α and β, together with USL. Draw a tri-variate simple random sample of size $n$ from the lot and compute the acceptance statistic $E$.
Step 2 Decide the lot's disposition: (1) if $E \ge k$, accept the inspected lot; (2) otherwise, reject the lot.
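The decision rule in the steps above can be sketched in a few lines. The exact expression for $E$ is not reproduced in this text, so the sketch assumes the usual variables-plan form $E = (\mathrm{USL} - \text{estimate})/\sigma$, where `estimate` stands for the value of the DID estimator computed from the sample. That form, the function names, and the numbers are all assumptions made for illustration.

```python
def acceptance_statistic(usl, estimate, sigma):
    # ASSUMPTION: E = (USL - estimate) / sigma, the usual variables-plan form.
    # The paper's own expression for E is not shown in the text.
    return (usl - estimate) / sigma


def sentence_lot(usl, estimate, sigma, k):
    """Return 'accept' if E >= k, otherwise 'reject'."""
    e = acceptance_statistic(usl, estimate, sigma)
    return "accept" if e >= k else "reject"


# Hypothetical numbers: E = (12.0 - 10.0) / 0.8 = 2.5, compared with k = 2.2.
decision = sentence_lot(usl=12.0, estimate=10.0, sigma=0.8, k=2.2)
```

With these hypothetical inputs the statistic is about 2.5, which is not below $k = 2.2$, so the lot would be accepted.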

As per [37], if the population under study follows the normal distribution, then $\bar{Y}_{DID}$ also follows the normal distribution. The OC function, that is, the probability of accepting the lot, is therefore expressed through $z_p$ and $\Phi(\cdot)$, which denote the $p$-th percentile and the cumulative distribution function (CDF) of the standard normal distribution, respectively.
Two risks are associated with lot sentencing in an acceptance sampling plan: rejecting a good lot and accepting a bad lot. Let α denote the producer's risk and β the consumer's risk, and let $p_1$ denote the acceptable quality level (AQL) and $p_2$ the limiting quality level (LQL). The plan parameters are found from the simulation output so that the OC curve passes through the two points associated with $(\alpha, p_1)$ and $(\beta, p_2)$ and the two-fold optimization conditions are satisfied with the smallest possible sample size $n$.
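For reference, the two-point conditions usually imposed in this kind of design can be written compactly. The block below is a sketch of that standard formulation, with $L(p)$ used here as an assumed symbol for the probability of acceptance at fraction nonconforming $p$; it is not a transcription of the paper's own equations.

```latex
\begin{aligned}
\text{minimize}\quad   & n \\
\text{subject to}\quad & L(p_1) \ge 1 - \alpha, \\
                       & L(p_2) \le \beta .
\end{aligned}
```

The first constraint limits the producer's risk at the AQL and the second limits the consumer's risk at the LQL; among all pairs $(n, k)$ meeting both, the pair with the smallest $n$ is retained.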
The presented plan reduces to the existing plan as a special case when $\rho_{xz} = \rho_{xy} = \rho_{yz} = \rho = 0$; in this case $\bar{Y}_{DID} = \bar{y}$.

Case 2: σ unknown.
In real-world scenarios σ is unknown most of the time and can be estimated by the sample standard deviation $S$. For the σ-unknown situation we suggest the following sampling procedure:
Step 1 Take a random sample of size $n$ from the lot and obtain the mean characteristic $\bar{y}$. Then compute the sample standard deviation $S$ and the acceptance statistic $E^*$.
Step 2 Accept the lot if $E^* \ge k$; otherwise reject the lot.
The OC function giving the acceptance probability of the suggested plan in the σ-unknown case is derived following [37]. As in Case 1, the same optimization conditions are used to determine the plan parameters, with the plan equations constrained so as to minimize $n$. The values of $n$ and $k$ are found for various combinations of AQL and LQL and different values of ρ, as in the σ-known case. The following tendencies can be seen in Table 3:
1. For fixed $p_1$, the sample size $n$ decreases as $p_2$ increases. For instance, with $p_1 = 0.001$ and ρ = 0.7: for $p_2 = 0.004$, $n = 201$; for $p_2 = 0.006$, $n = 110$; for $p_2 = 0.008$, $n = 77$; and for $p_2 = 0.020$, $n = 30$.
2. For fixed $p_2$, the sample size $n$ increases as $p_1$ increases. For instance, with $p_2 = 0.020$ and ρ = 0.7: for $p_1 = 0.001$, $n = 30$; for $p_1 = 0.0025$, $n = 51$; and for $p_1 = 0.005$, $n = 97$.
3. As ρ decreases, the sample size increases. For instance, with $p_1 = 0.001$ and $p_2 = 0.006$: at ρ = 0.9, $n = 103$; at ρ = 0.7, $n = 110$; and at ρ = 0.5, $n = 117$.
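The two steps of Case 2 can be sketched as follows. Because the expression for $E^*$ is not reproduced in this text, the sketch assumes $E^* = (\mathrm{USL} - \text{estimate})/S$, with $S$ the usual sample standard deviation (divisor $n - 1$). When no DID estimate is supplied, the plain sample mean $\bar{y}$ is used, which corresponds to the special case ρ = 0 mentioned above. The function name and the data are hypothetical.

```python
import statistics


def sentence_lot_unknown_sigma(y_values, usl, k, estimate=None):
    """Return (E*, decision) for the sigma-unknown rule E* >= k."""
    s = statistics.stdev(y_values)  # sample standard deviation, n - 1 divisor
    if estimate is None:
        # Fallback: plain sample mean, i.e. the rho = 0 special case.
        estimate = statistics.fmean(y_values)
    # ASSUMPTION: E* = (USL - estimate) / S; the paper's expression is not shown.
    e_star = (usl - estimate) / s
    return e_star, ("accept" if e_star >= k else "reject")


# Hypothetical measurements from one lot.
y_values = [9.8, 10.1, 10.0, 9.9, 10.2]
e_star, decision = sentence_lot_unknown_sigma(y_values, usl=11.0, k=2.5)
```

Passing a DID-based value through the `estimate` argument changes only the numerator; the comparison with $k$ stays the same.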

Algorithm
The design parameters of the proposed plan are $n$ and $k$ under the single sampling scheme described in section "Design". The algorithmic steps for computing the plan parameters are as follows (Fig. 1 shows the algorithm as a flow chart):
Step 1 Specify the values of $p_1$, $p_2$ and ρ.
Step 2 Generate 100,000 values of $n$ and $k$ from the uniform distribution.
Step 3 Compute the values of the OC function at $p_1$ and $p_2$ for the given ρ.
Step 4 Find the values of $n$ and the corresponding $k$ that satisfy the plan equations.
Step 5 Find the smallest value of $n$ and the corresponding $k$ among those obtained in Step 4.
Step 6 Record the smallest $n$ and the corresponding $k$ obtained over the 100,000 simulated values.
Step 7 Choose the lowest value of $n$ and the corresponding $k$ from Step 6 as the computed plan parameters.
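To make the search for $(n, k)$ concrete, the sketch below designs the classical known-σ variables plan, whose OC function is $L(p) = \Phi\!\left(\sqrt{n}\,(z_p - k)\right)$ with $z_p = \Phi^{-1}(1 - p)$. This classical OC function is swapped in because the DID-based OC function is not reproduced in this text, and a deterministic scan over $n$ replaces the random uniform draws of Step 2. The risk levels α = 0.05 and β = 0.10 and all function names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def oc_classical(p, n, k):
    """P(accept) for the classical known-sigma plan: Phi(sqrt(n) * (z_p - k))."""
    z_p = STD.inv_cdf(1.0 - p)  # upper-p quantile of N(0, 1)
    return STD.cdf(sqrt(n) * (z_p - k))


def smallest_plan(p1, p2, alpha, beta, n_max=10_000):
    """Scan n upward and return the first (n, k) meeting both risk constraints."""
    z1, z2 = STD.inv_cdf(1.0 - p1), STD.inv_cdf(1.0 - p2)
    za, zb = STD.inv_cdf(1.0 - alpha), STD.inv_cdf(1.0 - beta)
    for n in range(2, n_max + 1):
        k_low = z2 + zb / sqrt(n)   # from L(p2) <= beta
        k_high = z1 - za / sqrt(n)  # from L(p1) >= 1 - alpha
        if k_low <= k_high:
            return n, (k_low + k_high) / 2.0  # midpoint of the feasible range
    return None


# Assumed risks alpha = 0.05 and beta = 0.10 (not stated in the text).
n, k = smallest_plan(p1=0.001, p2=0.004, alpha=0.05, beta=0.10)
```

For each $n$ the two constraints bound $k$ from below and above, so feasibility can be checked in closed form; the same loop structure would apply to any OC function, although a numerical search over $k$ may then be needed.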
The theoretical significance and practical contributions of the concept are as follows:

Theoretical significance
1. Increased Precision: Using three auxiliary variables in an estimator can enhance precision by incorporating more information, leading to more reliable and accurate estimates.
2. Bias Reduction: The inclusion of three auxiliary variables may reduce bias, making the estimator more robust and less susceptible to the biases inherent in simpler models, as discussed by … for developing the three auxiliary variables-based estimator to estimate the study variable.
3. Model Flexibility: Three auxiliary variables provide greater flexibility in modeling complex relationships, allowing for a more nuanced understanding of the underlying dynamics. For this reason, researchers focus on auxiliary-information-based estimators rather than complex sampling procedures, which would increase the cost of lot sentencing.

Practical contribution
1. Improved Predictive Power: The use of three auxiliary variables can enhance predictive modeling, contributing to better predictions of the target variable.
2. Variable Selection: The three auxiliary variables can aid in identifying relevant factors, contributing to better-informed decision-making and understanding of the studied system.
3. Generalization to Multivariate Cases: Having three auxiliary variables is crucial for extending models to multivariate scenarios, capturing interactions between multiple variables.

In summary, employing an estimator with three auxiliary variables offers theoretical advantages such as increased precision and reduced bias, while practical contributions include improved predictive power, better variable selection, and the ability to handle more complex, multivariate situations.

Results discussion
In this paper, an attempt has been made to provide a comparison with the approximation approach for acceptance sampling plans since, to the best of the authors' knowledge, no directly comparable method exists. In addition, a comparative study with two auxiliary-information-based acceptance sampling plans is provided, as previously mentioned. This is because no effort has been made in the literature to use three variables to estimate the study variable, as the proposed design does.
This section has three parts: the special case of the suggested plan, which explains its advantage over the existing sampling plans in the literature, a comparative study with [38], and a comparative study with [39]. The suggested sampling plan was found to work better than the current sampling plans of [38, 39] and than the current sampling plans under the special case ρ = 0. Figure 2 gives a graphical summary of the comparative analysis, showing that the suggested plan outperformed the current plans in terms of the required sample size. The proposed concept is better suited than existing SPC acceptance sampling plans to settings such as economic studies and quasi-experimental designs, where it helps control for confounding variables. According to the graphical display, the suggested method requires a smaller average sample number (ASN) for the lot sentencing procedure. The display is built for the σ-known scenario with σ = 0.8 and AQL fixed at 0.001. A detailed discussion of the comparison between each current plan and the proposed plan follows.

Table 3. Plan parameters of single sampling plan using DID estimator when σ is unknown at different values of $\rho_{xz} = \rho_{xy} = \rho_{yz} = \rho$.

A comparative study with [38]
First, a comparison is made with the existing plan of [38] to support the argument that the proposed concept outperforms the existing sampling plans. The results are shown in Tables 4 and 5 for the σ-known and σ-unknown cases, respectively. Table 4 considers $p_1 = 0.001$ and $p_2 = 0.002, 0.003, 0.004$ and $0.006$ for ρ = 0.8, 0.6 and 0.4. Similarly, Table 5 shows the results for $p_1 = 0.0025$ and $p_2 = 0.010, 0.015, 0.020$ and $0.025$ for ρ = 0.8, 0.6 and 0.4, as there is a relationship between $y$ and $x$, where $x$ is treated as a piece of auxiliary information. The findings can be explained as follows:
• For $p_1 = 0.001$ and $p_2 = 0.002, 0.003$ and $0.006$ in the σ-known case (Table 4), the proposed plan gives $n = 57, 22$ and $8$ for ρ = 0.8, while the existing plan gives $n = 153, 59$ and $21$; for ρ = 0.6 the proposed plan gives $n = 106, 42$ and $15$, while the existing plan gives $n = 172, 67$ and $24$. The same behavior can be observed for the other parametric values shown in Table 4.
• For $p_1 = 0.0025$ and $p_2 = 0.010, 0.015$ and $0.020$ in the σ-unknown case (Table 5), the proposed plan gives $n = 132, 71$ and $49$ for ρ = 0.8, while the existing plan gives $n = 150, 82$ and $56$; for ρ = 0.6 the proposed plan gives $n = 141, 76$ and $53$, while the existing plan gives $n = 154, 84$ and $58$. The same can be seen for the other parametric values shown in Table 5.
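The size of the saving reported for the σ-known case can be made explicit with a short calculation. The sketch below takes the sample sizes quoted from Table 4 in the text and computes the relative reduction in $n$; the variable and function names are illustrative only.

```python
# Sample sizes quoted in the text for the sigma-known case (Table 4),
# for p1 = 0.001 and p2 = 0.002, 0.003, 0.006.
proposed = {0.8: [57, 22, 8], 0.6: [106, 42, 15]}
existing = {0.8: [153, 59, 21], 0.6: [172, 67, 24]}


def reduction_percent(n_new, n_old):
    """Relative reduction in sample size, in percent, rounded to one decimal."""
    return round(100.0 * (n_old - n_new) / n_old, 1)


savings = {
    rho: [reduction_percent(a, b) for a, b in zip(proposed[rho], existing[rho])]
    for rho in proposed
}
print(savings)
# {0.8: [62.7, 62.7, 61.9], 0.6: [38.4, 37.3, 37.5]}
```

On these quoted figures the reduction is roughly 62% at ρ = 0.8 and roughly 37 to 38% at ρ = 0.6, which is consistent with the stated tendency that a stronger correlation gives a smaller sample size.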
To demonstrate the concept's effectiveness, a sampling plan based on two pieces of auxiliary information is compared, in which successive occasions were targeted to estimate the study variable. To the best of the authors' knowledge,

Figure 1. Flow Chart of the Proposed DID Sampling Plan Methodology.

Figure 2. Efficiency Comparison of Existing Plans with Proposed DID Sampling Plan.
Table 1. Key symbols and notations.
Symbol    Description
EWMA    Exponentially Weighted Moving Average
OC    Operating Characteristic
$\bar{Y}_{DID}$    Population Mean through DID Estimator
$\bar{y}$    Sample Mean of Variable y
$\mu_y$    Population Mean of Variable y
$\mu_x$    Population Mean of Variable x
$\bar{x}$    Sample Mean of Variable x
$\mu_z$    Population Mean of Variable z
$\bar{z}$    Sample Mean of Variable z
$\beta_{yx}$    Regression Coefficient between y and x
$\beta_{xz}$    Regression Coefficient between x and z
$\beta_{yz}$    Regression Coefficient between y and z
$\sigma_x^2$    Population Variance of Variable x
$\sigma_y^2$    Population Variance of Variable y
$\sigma_z^2$    Population Variance of Variable z
$\rho_{yx}$    Population Correlation between Variable y and x
$\rho_{yz}$    Population Correlation between Variable y and z
$\rho_{xz}$    Population Correlation between Variable x and z

Scientific Reports | (2023) 13:22305 | https://doi.org/10.1038/s41598-023-49786-8

Table 2. Plan parameters of single sampling plan using DID estimator when σ is known at different values of $\rho_{xz} = \rho_{xy} = \rho_{yz} = \rho$.